Running Head: VISUAL INFORMATION AND PHONETIC CONTEXT EFFECTS

A Critical Evaluation of Visually-Moderated Phonetic Context Effects

Authors

  • Lori L. Holt
  • Joseph D. W. Stephens
  • Andrew J. Lotto
Abstract

Fowler, Brown, and Mann (2000) report a visually-moderated phonetic context effect in which a video disambiguates an acoustically ambiguous precursor syllable which, in turn, influences perception of a following syllable. The present experiments explore this finding and the claims that stem from it. Experiment 1 failed to replicate Fowler et al. with novel materials modeled after the original study, but Experiment 2 successfully replicated the effect using Fowler et al.’s stimulus materials. This discrepancy was investigated in Experiments 3 and 4, which demonstrate that variation in visual information concurrent with the test syllable is sufficient to account for the original results. The Fowler et al. visually-moderated phonetic context effect appears to have been a demonstration of audiovisual interaction between concurrent stimuli and not an effect whereby preceding visual information elicits changes in the perception of subsequent speech sounds.

Speech perception makes use of both auditory and visual information. McGurk and MacDonald (1976) demonstrated a potent and reliable effect in which mismatches between auditory and visual speech cues lead to perceptual identifications that would not be obtained based on the information presented to either modality alone. This effect has been replicated many times (see Colin & Radeau, 2003, for a review) and is disrupted neither by the perceiver’s awareness of the audiovisual mismatch nor by the perceiver’s direction of attention to one modality or the other (Massaro, 1987).

Fowler, Brown, and Mann (2000) recently reported a visual influence on phonetic identification using an auditory/visual paradigm first described by Vroomen (1992; see also Green & Norrix, 2001). As implemented by Fowler et al., this paradigm takes advantage of a well-established demonstration of the context-dependent nature of speech perception (Mann, 1980) whereby listeners’ identification of syllables ranging perceptually from /ga/ to /da/ is examined in the context of a preceding /al/ or /ar/ syllable. Perception is context dependent in that test syllables are more often identified as /ga/ when they are preceded by /al/; listeners identify the same syllables more often as /da/ when they are preceded by /ar/.

Fowler et al. investigated whether visual stimuli may moderate such effects of context on following syllables. Specifically, Fowler et al. created an acoustic stimulus that they judged to be perceptually ambiguous between /al/ and /ar/. This stimulus served as the first-syllable soundtrack for a video of a male speaker articulating /alda/ or /arda/. Although the acoustically ambiguous precursor remained constant across conditions and only the visual information varied to disambiguate the precursor, Fowler et al. (2000) observed a context effect on listeners’ perception of subsequent acoustic syllables drawn from an acoustic /ga/ to /da/ series: listeners identified test syllables more often as “ga” when visual information accompanying the precursor indicated an /al/ than when it indicated an /ar/.

The observation of a visually-moderated phonetic context effect indicates that auditory and visual information may be integrated to shift speech identification. At the most general level, this finding is not novel.
In the classic McGurk effect, an audio soundtrack of /ba/ presented with a video stimulus of a face articulating /ga/ interacts to create a perceived /da/. What is different about the Fowler et al. (2000) study is that visual information paired with the precursor modulates precursor perception which, in turn, may produce a shift in the identification of subsequent phonetic segments. This sort of context effect can be considered a “second-order” or “indirect” effect of visual context.

Similar second-order effects have been observed for other sources of information. Elman and McClelland (1988), for example, demonstrated that lexical information may serve to disambiguate an acoustically ambiguous fricative between /s/ and /ʃ/ and, consequently, produce a context effect on perception of a member of a following /t/-/k/ series (see also Pitt & McQueen, 1998; Magnuson, McMurray, Tanenhaus, & Aslin, 2003; Samuel & Pitt, 2003). The Fowler et al. paradigm is similar to that developed by Elman and McClelland in that it demonstrates that context effects can occur in the absence of acoustic change across precursor contexts. Apparently, either visual or lexical information can influence processing of an acoustically ambiguous syllable and thereby affect perception of a following syllable.

Since the Fowler et al. (2000) study, another investigation of visual moderation of phonetic context effects has been reported. In a study with methods very much like those of Fowler et al., Vroomen and de Gelder (2001) found little evidence of visually-moderated phonetic context effects. With unimodal acoustic stimuli, a preceding /s/ leads to more “ka” identifications for a /ta/ to /ka/ test syllable series than does a preceding /ʃ/ (Mann & Repp, 1981). Vroomen and de Gelder investigated whether an ambiguous acoustic fricative between /s/ and /ʃ/ would produce a phonetic context effect on a following /ta/ to /ka/ series when the fricative was disambiguated by video of a speaker saying /aska/ or /aʃka/. Although the video served to reliably influence participants’ identification of the fricative, the resulting cross-modal percept did not shift identification of the following test syllable. This finding persisted even when reduced amplitude of the acoustic stimuli encouraged listeners to rely more on visual information.

Thus, although Vroomen and de Gelder (2001) and Fowler et al. (2000) employed very similar methods to examine indirect, or second-order, auditory-visual context effects, their results were strikingly different. Whereas Fowler et al. observed a significant visually-moderated context effect on subsequent speech identification, Vroomen and de Gelder found no evidence of such effects. Given the close similarity of the two studies, the discrepancy in their findings is puzzling.

Indirect context effects arising from disambiguating lexical information have been very influential in understanding the dynamics of processing in spoken word recognition (e.g., Elman & McClelland, 1988; Pitt & McQueen, 1998; Magnuson et al., 2003; Samuel & Pitt, 2003). Indirect effects arising from visual information could similarly inform models of speech perception and spoken word recognition. Thus, we undertook to replicate the visually-moderated phonetic context effect reported by Fowler et al. in an attempt to determine the variables that are responsible for the presence or absence of these effects.
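As a concrete illustration of the dependent measure at issue in these studies, the short Python sketch below computes the magnitude of a phonetic context effect as the mean shift in “ga” identification between /al/ and /ar/ precursor contexts. All response proportions in the sketch are invented for demonstration; they are not data from Mann (1980), Fowler et al. (2000), or any experiment reported here.

    # Hypothetical illustration: quantifying a phonetic context effect
    # as the shift in "ga" identification across precursor contexts
    # (cf. Mann, 1980). All numbers are invented for demonstration.

    # Proportion of "ga" responses at each step of a /ga/-/da/ test
    # series, tabulated separately by preceding context.
    ga_responses = {
        "al": [0.95, 0.88, 0.70, 0.46, 0.22, 0.05],  # after /al/
        "ar": [0.90, 0.74, 0.48, 0.21, 0.09, 0.02],  # after /ar/
    }

    # A positive mean difference (more "ga" responses after /al/ than
    # after /ar/) is a context effect in the direction Mann reported.
    diffs = [al - ar for al, ar in zip(ga_responses["al"], ga_responses["ar"])]
    print(f"Mean context effect (/al/ minus /ar/): {sum(diffs) / len(diffs):.3f}")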
Experiment 1

In Experiment 1, we attempted to replicate the visually-moderated context effect reported by Fowler et al. (2000) using novel stimulus materials based on the Fowler et al. stimulus description.
